misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny
Authors
Abstract
As next-generation sequencing technologies have advanced, enormous amounts of whole-genome sequence information have been released for various species. However, it is still difficult to assemble a whole genome precisely because of the inherent limitations of short-read sequencing technologies. Plant genomes are particularly complex compared with those of microorganisms or animals because of whole-genome duplications, repeat insertions, and Numt insertions. In this study, we describe a new method for detecting misassembled sequence regions of Brassica rapa using genotyping-by-sequencing followed by MadMapper clustering. The candidate misassembly regions were cross-checked against BAC clone paired-end library sequences that had been mapped to the reference genome. The results were further verified using gene synteny relations between Brassica rapa and Arabidopsis thaliana. We conclude that this method will help detect misassembled regions and will be applicable to incompletely assembled reference genomes from a variety of species.
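The core idea of checking an assembly against marker clustering can be illustrated with a minimal sketch. It is not the misMM pipeline or MadMapper itself: it assumes that each genotyped marker already carries a physical chromosome assignment from the reference and a linkage-group label from some clustering step, and it flags markers whose linkage group disagrees with the majority group on their chromosome. The function name `flag_discordant_markers`, the tuple layout, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def flag_discordant_markers(markers):
    """Return markers whose linkage group differs from the majority
    linkage group of the chromosome they are placed on.

    markers: list of (marker_id, chromosome, position, linkage_group).
    This tuple layout is an assumption made for illustration only.
    """
    # Collect the linkage-group labels seen on each chromosome.
    groups_by_chrom = defaultdict(list)
    for _, chrom, _, linkage_group in markers:
        groups_by_chrom[chrom].append(linkage_group)

    # The most frequent linkage group is treated as the expected one.
    majority = {
        chrom: Counter(groups).most_common(1)[0][0]
        for chrom, groups in groups_by_chrom.items()
    }

    # A marker that clusters elsewhere is a misassembly candidate.
    return [m for m in markers if m[3] != majority[m[1]]]


# Made-up example data: m3 sits on A01 but clusters with LG2.
markers = [
    ("m1", "A01", 1000, "LG1"),
    ("m2", "A01", 2000, "LG1"),
    ("m3", "A01", 3000, "LG2"),
    ("m4", "A01", 4000, "LG1"),
    ("m5", "A02", 1500, "LG2"),
    ("m6", "A02", 2500, "LG2"),
]
candidates = flag_discordant_markers(markers)
print([m[0] for m in candidates])
```

In practice a flagged marker would only be a candidate; as the abstract describes, independent evidence such as mapped BAC paired-end sequences and gene synteny would be needed to confirm it.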
Similar resources
A Physical Map of the Short Arm of Wheat Chromosome 1A
Bread wheat (Triticum aestivum) has a large and highly repetitive genome which poses major technical challenges for its study. To aid map-based cloning and future genome sequencing projects, we constructed a BAC-based physical map of the short arm of wheat chromosome 1A (1AS). From the assembly of 25,918 high information content (HICF) fingerprints from a 1AS-specific BAC library, 715 physical ...
A second-generation integrated map of the silkworm reveals synteny and conserved gene order between lepidopteran insects.
A second-generation linkage map was constructed for the silkworm, Bombyx mori, focusing on mapping Bombyx sequences appearing in public nucleotide databases and bacterial artificial chromosome (BAC) contigs. A total of 874 BAC contigs containing 5067 clones (22% of the library) were constructed by PCR-based screening with sequence-tagged sites (STSs) derived from whole-genome shotgun (WGS) sequ...
Utilization of Super BAC Pools and Fluidigm Access Array Platform for High-Throughput BAC Clone Identification: Proof of Concept
Bacterial artificial chromosome (BAC) libraries are critical for identifying full-length genomic sequences, correlating genetic and physical maps, and comparative genomics. Here we describe the utilization of the Fluidigm access array genotyping system in conjunction with KASPar genotyping technology to identify individual BAC clones corresponding to specific single-nucleotide polymorphisms (SN...
Genotyping of Intron 22 and Intron 1 Inversions of Factor VIII Gene Using an Inverse-Shifting PCR Method in an Iranian Family with Severe Haemophilia A
Abstract Background: Haemophilia A (HA) is an X-linked bleeding disorder caused by the absence or reduced activity of coagulation factor VIII (FVIII). Coagulation factors are a group of related proteins that are essential for the formation of blood clots. The aim of this study was to genotype the coagulation factor VIII gene mutations using Inverse Shifting PCR (IS-PCR) in an Iranian family ...
Construction, characterization and chromosomal mapping of bacterial artificial chromosome (BAC) library of Yunnan snub-nosed monkey (Rhinopithecus bieti)
We constructed a high redundancy bacterial artificial chromosome library of a seriously endangered Old World Monkey, the Yunnan snub-nosed monkey (Rhinopithecus bieti) from China. This library contains a total of 136 320 BAC clones. The average insert size of BAC clones was estimated to be 148 kb. The percentage of small inserts (50–100 kb) is 2.74%, and only 2.67% non-recombinant clones were o...